class: center, middle, inverse, title-slide

# Peer-reviewing Code in Statistical R Packages: Standards, Processes, and Tools

### Mark Padgham & Noam Ross
rOpenSci & EcoHealth Alliance
Münster, Germany & New York, USA

---
class: center, middle, inverse
background-image: url(data:image/png;base64,#img/ropensci-web.png)
background-size: 100%
background-position: 50% 50%

--

.fonthuge[ropensci.org]

---
class: center, middle, inverse
background-image: url(data:image/png;base64,#img/ropensci-web-zoom.png)
background-size: 100%
background-position: 50% 50%

---
class: center, middle, inverse
background-image: url(data:image/png;base64,#img/ropensci-pkgs1.png)
background-size: 100%
background-position: 50% 50%

---
class: center, middle, inverse
background-image: url(data:image/png;base64,#img/ropensci-pkgs2.png)
background-size: 100%
background-position: 50% 50%

---
class: center, middle, inverse
background-image: url(data:image/png;base64,#img/ropensci-peer-review.png)
background-size: 100%
background-position: 50% 50%

---
class: center, middle, inverse
background-image: url(data:image/png;base64,#img/ropensci-stats-peer-review.png)
background-size: 100%
background-position: 50% 50%

---
class: center, top
background-image: url(data:image/png;base64,#img/SloanLogo-1B-SMALL-Gold-Blue.png)
background-size: 90%
background-position: 50% 50%

--

# Expanding Software Peer Review

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-dark.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Peer Review

- ### ✅ 1. Define "Statistical Software"
- ### ✅ 2. Develop Standards for Statistical Software
- ### ✅ 3. Develop Tools to Aid Software Development
- ### ✅ 4. Automate System for Software Submissions
- ### ✅ 5. Open Peer Review System to Statistical Software

<br><br>
(Status: November 2021)

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-dark.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Peer Review

- ### ✅ 1. Define "Statistical Software"

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Peer Review

## What is Statistical Software?

- Empirical Analysis
- Data: All abstracts from major statistics conferences (JSM, SDSS, ... n ~ 20,000)
- Methods: Natural Language Processing → concept clusters → network analyses
- Results ...

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Peer Review

## What is Statistical Software?

.left-column[
- Bayesian and Monte Carlo
- Regression and Supervised Learning
- Dimensionality Reduction, Clustering, and Unsupervised Learning
- Exploratory Data Analysis (EDA) and Summary Statistics
]

.right-column[
- Time Series Analyses
- Machine Learning
- Spatial Analyses
- ~~Wrapper Packages~~
- ~~Network Analysis~~
- ~~Probability Distributions~~
- ~~Workflow Support~~
]

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-dark.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Peer Review

- ### ✅ 1. Define "Statistical Software"

--

- ### ✅ 2. Develop Standards for Statistical Software

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Standards

## Methodology:

- Translate observed practices into principles
- Write these principles as standards
- Selectively elevate standards to *General*
- Iterate, remove, add, refine, restructure standards

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Standards

## Results

- Input from > 20 people
- 40 General Standards
- 265 Category-Specific Standards
- 7 Categories
- Four Further Categories in Development

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Standards

### [General & Category-Specific Standards](https://stats-devguide.ropensci.org/standards.html)<br>generally divided into standards for:

- Documentation
- Input Structures
- Algorithms
- Output Structures
- Testing

---
class: center, middle, inverse
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-dark.png)
background-size: contain
background-position: 50% 50%

# stats-devguide.ropensci.org

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Standards

## Examples from General Standards: *Documentation*

- ***G1.0*** *Statistical Software should list at least one primary reference from published academic literature.*
- ***G1.2*** *Statistical Software should include a Life Cycle Statement describing current and anticipated future states of development.*

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Standards

## Examples from General Standards: *Algorithms*

- ***G3.0*** *Statistical software should never compare floating point numbers for equality. All numeric equality comparisons should either ensure that they are made between integers, or use appropriate tolerances for approximate equality.*

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Standards

## Examples from Standards for Bayesian and Monte-Carlo Software:<br>*Pre-processing*

Where appropriate, Bayesian Software should:

- ***BS3.1*** *Implement pre-processing routines to diagnose perfect collinearity, and provide appropriate diagnostic messages or warnings*
- ***BS3.2*** *Provide distinct routines for processing perfectly collinear data, potentially bypassing sampling algorithms*

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-dark.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Peer Review

- ### ✅ 1. Define "Statistical Software"
- ### ✅ 2. Develop Standards for Statistical Software
- ### ✅ 3. Develop Tools to Aid Software Development

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/package-flow.png)
background-size: contain
background-position: 50% 50%

---
class: center, middle, inverse
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-dark.png)
background-size: contain
background-position: 50% 50%

# The 'srr' package

## `srr` = "Software Review Roclets"

(A "roclet" is a documentation engine,<br>or "doclet", for the **R** language.)

## github.com/ropensci-review-tools

## search: "ropensci srr"

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# The 'srr' package: How?

- [Enable standards compliance statements to be included within code itself, at the point(s) at which each standard is actually addressed.](https://ropensci-review-tools.github.io/roreviewapi/static/tsbox_srraad57cea.html)

--

- One function dumps full text of standards in file `R/srr-stats-standards.R`
- All are initially tagged `@srrstatsTODO`
- Standards are addressed by changing tag to `@srrstats` and moving to location within code where standard is addressed

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# The 'srr' package: How?

- Reports on standards compliance automatically generated locally each time documentation is updated.

```r
roxygen2::roxygenise(path)
```

```
## ℹ Loading demo
##
## ── rOpenSci Statistical Software Standards ────────────────────────────
##
## ── @srrstats standards:
## • [G1.0] in function 'addtwo()' on line#11 of file [R/test.R]
```

---
class: left, top
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-faded.png)
background-size: contain
background-position: 50% 50%

# The 'srr' package: Summary

- Insert standards into package with one line of code
- Comply with standards
- Update documentation to automatically generate compliance report
- Generate hyperlinked `html` version with `srr_report()` function
- Also extensible, with alternative standards<br>substituted by modifying a single URL

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/package-flow.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets1.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets2.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets3.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets4.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets5.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets6.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets7.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets5.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets8.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/gittargets1.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/bot.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/bot2.png)
background-size: contain
background-position: 50% 50%

---
class: left, top, inverse
background-image: url(data:image/png;base64,#img/comm-call.png)
background-size: contain
background-position: 50% 50%

---
class: left, middle, inverse
background-image: url(data:image/png;base64,#img/icon_lettering_color_large-dark.png)
background-size: contain
background-position: 50% 50%

# Statistical Software Peer Review

- Now accepting submissions
- Search "rOpenSci statistical software"
- `stats-devguide.ropensci.org`
- All tools available for use now at `github.com/ropensci-review-tools`
- Please attend our community call: 7 Dec 2021
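
---
class: left, top

# Appendix: Standard G3.0 in practice

Standard ***G3.0*** asks that floating point numbers never be compared for exact equality. The sketch below shows one way this might look in base R. The helper name `approx_equal()` and its default tolerance are illustrative assumptions, not part of the standards or of any rOpenSci package.

```r
# Hypothetical helper (an assumption for illustration only):
# element-wise comparison of two numeric vectors within an absolute tolerance.
approx_equal <- function(x, y, tol = sqrt(.Machine$double.eps)) {
  stopifnot(is.numeric(x), is.numeric(y), is.numeric(tol), tol >= 0)
  abs(x - y) <= tol
}

a <- 0.1 + 0.2
b <- 0.3

exact  <- a == b                   # exact comparison, discouraged by G3.0: FALSE
approx <- approx_equal(a, b)       # tolerance-based comparison: TRUE
base_r <- isTRUE(all.equal(a, b))  # base R's built-in approximate check: TRUE
```

Whether an absolute or a relative tolerance is appropriate depends on the scale of the data; `all.equal()` uses a relative tolerance by default.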